{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "W7_Homework",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true,
      "authorship_tag": "ABX9TyNzOabK1XpdMGq27vgSKCgu",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/CIS-522/course-content/blob/main/tutorials/W07_Vision_TL/W7_Homework.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KNU5GiXY_Dq6"
      },
      "source": [
        "# CIS-522 Week 7 Homework\n",
        "\n",
        "\n",
        "**Instructor:** Konrad Kording\n",
        "\n",
        "**Content Creator:** Ben Heil"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "szKIY9I-_Qa-"
      },
      "source": [
        "---\n",
        "## Preface\n",
        "Since this week's homework requires coding, we recommend saving this notebook in your google Drive (`File -> Save a copy in Drive`), and share the link to the final version in the subscription airtable form. You can also attach the code to the form if you prefer off-colab coding."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "vAdEW6Eq-8IN",
        "cellView": "form"
      },
      "source": [
        "#@markdown What is your Pennkey and pod? (text, not numbers, e.g. bfranklin)\n",
        "my_pennkey = '' #@param {type:\"string\"}\n",
        "my_pod = 'Select' #@param ['Select', 'euclidean-wombat', 'sublime-newt', 'buoyant-unicorn', 'lackadaisical-manatee','indelible-stingray','superfluous-lyrebird','discreet-reindeer','quizzical-goldfish','astute-jellyfish','ubiquitous-cheetah','nonchalant-crocodile','fashionable-lemur','spiffy-eagle','electric-emu','quotidian-lion', 'quantum-herring']\n",
        "\n",
        "import time\n",
        "t0 = time.time()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "3tBIYJK0vr6z"
      },
      "source": [
        "## Setup"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Rq1asJ9vvseM"
      },
      "source": [
        "# imports\n",
        "import time\n",
        "\n",
        "import matplotlib.pyplot as plt\n",
        "import numpy as np\n",
        "import torch\n",
        "import torch.nn as nn\n",
        "import torch.nn.functional as F\n",
        "import torch.optim as optim\n",
        "import tqdm\n",
        "from IPython import display\n",
        "from torchvision import transforms\n",
        "from torchvision.datasets import ImageFolder\n",
        "from torch.utils.data import DataLoader"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "jWTtTWU_rocl"
      },
      "source": [
        "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FiL8t746JGKp"
      },
      "source": [
        "## Part 1: Reading\n",
        "For this week's reading portion, we'll have you read the [original AlexNet paper](https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf). As you read it, be sure to notice how many features of modern state of the art networks are already present in AlexNet. For example, they use ReLU for their activation function, use data augmentation, and learning rate decay. They also foreshadow the successes and challenges of deep learning with the line \"Thus\n",
        "far, our results have improved as we have made our network larger and trained it longer but we still\n",
        "have many orders of magnitude to go in order to match the infero-temporal pathway of the human\n",
        "visual system\". \n",
        "\n",
        "A side note on the growth of compute power over time: AlexNet was trained on two Nvidia GTX 580s, which were top of the line graphics cards for their time. Their theoretical processing speed is roughly [1.6 Trillion](https://www.techpowerup.com/gpu-specs/geforce-gtx-580.c270) Floating point Operations Per Second (TFLOPS). This year's iPhone graphics cards have a theoretical speed of [.8 TFLOPS](https://www.cpu-monkey.com/en/cpu-apple_a14_bionic-1693). That is to say that if you were to go back in time ten years with four iPhones, you could conduct cutting edge deep learning research with them."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oWYaoS6MKDOS"
      },
      "source": [
        "To what extent have the gains in computer vision performance over the past ten years been due to new ideas as opposed to increases in computational power enabling the scaling of old ideas?"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qRIV7m6zJ8a2",
        "cellView": "form"
      },
      "source": [
        "developments = '' #@param {type:\"string\"}"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oLT8niDvPqdP"
      },
      "source": [
        "## Part 2: Implementation\n",
        "As it turns out, AlexNet is representative of how many convolutional neural networks are implemented. We'll have you implement [version 2 of AlexNet](https://arxiv.org/abs/1404.5997) so you don't have to worry about paralellizing across GPUs.\n",
        "\n",
        "Because the architecture details in the paper are largely left in footnotes, we'll provide it below:\n",
        "\n",
        "Convolutional Block:\n",
        "\n",
        "1.   Five [convolutional layers](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) with output filter counts of 64, 192, 384, 256, and 256 respectively. Their associated kernel sizes are 11, 5, 3, 3, 3. Their strides are 4, 1, 1, 1, 1. Finally, the first two layers have a padding of 2, while the last three have a padding of 1.\n",
        "2.   [Max pooling](https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html) layers after the first, second, and last layer. Each pooling step has a kernel of size 3 and a stride of 2.\n",
        "3.   ReLU nonlinearities after each layer\n",
        "\n",
        "Average Pooling:  \n",
        "After the convolutional block but before the classifier block, there is an [average pooling](https://pytorch.org/docs/stable/generated/torch.nn.AdaptiveAvgPool2d.html) layer with an output shape of (6,6).\n",
        "\n",
        "Classifier Block:\n",
        "\n",
        "1.   The model uses three [fully-connected](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html) layers. Their outputs are all of size 4096, 4096, and 10.\n",
        "2.   There are [dropout layers](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html) with probability .5 before each of the first two fully connected layers.\n",
        "3.   Finally, there is a ReLU after each fully-connected layer except the last one.\n",
        "\n",
        "\n",
        "Note:  \n",
        "When developing neural networks the size of each layer is an architecture choice, but the input to each layer is dependent solely on the input data or the output of the previous layer. To simulate this reality, we haven't provided the input shapes that will be required for the convolutional and fully-connected layers.\n",
        "\n",
        "Tips:\n",
        "- If you are unsure of what size the input should be, the shape of the output for each layer can be found in the `forward` function using the line `print(<prev_output>.shape)`\n",
        "- In Pytorch image tensors have the shape (batch size, channel count, image height, image width)\n",
        "- Conv layers expeect image tensors, but fully-connected layers expect flattened tensors (tensors with shape (batch size, dimension). You'll need to convert the shape at some point using [\\<output_tensor\\>.view(-1, dimension)](https://pytorch.org/docs/stable/tensors.html#torch.Tensor.view) or [torch.flatten](https://pytorch.org/docs/stable/generated/torch.flatten.html).\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Tc-5NGN7RgNQ"
      },
      "source": [
        "class AlexNet(nn.Module):\n",
        "    \"\"\"\n",
        "    Implementation from https://github.com/pytorch/vision/blob/master/torchvision/models/alexnet.py\n",
        "    \"\"\"\n",
        "    def __init__(self, num_classes: int = 1000) -> None:\n",
        "        super(AlexNet, self).__init__()\n",
        "        ...\n",
        "\n",
        "    def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
        "        ..."
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xwtROSjPXuxR"
      },
      "source": [
        "## Part 3: Evaluation\n",
        "Test how your model predicts Imagenette classes"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9L76OdyxjLbx",
        "cellView": "form"
      },
      "source": [
        "# @title Download imagenette\n",
        "!rm -r imagenette*\n",
        "!wget https://s3.amazonaws.com/fast-ai-imageclas/imagenette2-320.tgz\n",
        "!tar -xf imagenette2-320.tgz\n",
        "!rm -r imagenette2-320.tgz"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "w_RnqY64qcpL",
        "cellView": "form"
      },
      "source": [
        "# @title Prepare Imagenette Data\n",
        "val_transform = transforms.Compose((transforms.Resize((224, 224)),\n",
        "                                     transforms.ToTensor()))    \n",
        "\n",
        "imagenette_val = ImageFolder('imagenette2-320/val', transform=val_transform)\n",
        "\n",
        "train_transform = transforms.Compose((transforms.Resize((224, 224)),\n",
        "                                     transforms.ToTensor()))    \n",
        "\n",
        "imagenette_train = ImageFolder('imagenette2-320/train', transform=train_transform)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "uvieUWmzo5Aj"
      },
      "source": [
        "imagenette_train_loader = torch.utils.data.DataLoader(imagenette_train, \n",
        "                                                      batch_size=16,\n",
        "                                                      shuffle=True)\n",
        "\n",
        "imagenette_val_loader = torch.utils.data.DataLoader(imagenette_val, \n",
        "                                                    batch_size=16, \n",
        "                                                    shuffle=True)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ev7Cmvy4TFBP"
      },
      "source": [
        "### Test Model Training\n",
        "\n",
        "One good way to make sure your model is performing correctly is to train it to overfit a single batch. If this loop works correctly, then the model is at least able to update its weights and produce an output."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "VRtSsj7mSckZ"
      },
      "source": [
        "# Overfit a single batch\n",
        "\n",
        "alexnet = AlexNet().to(device)\n",
        "optimizer = optim.Adam(alexnet.parameters(), lr=1e-4)\n",
        "loss_fn = torch.nn.CrossEntropyLoss()\n",
        "\n",
        "batch = next(iter(imagenette_train_loader))\n",
        "\n",
        "accuracies = []\n",
        "for epoch in range(200):\n",
        "\n",
        "    # Train loop\n",
        "    alexnet.train()\n",
        "    images, labels = batch\n",
        "    images = images.to(device)\n",
        "    labels = labels.to(device)\n",
        "    \n",
        "    optimizer.zero_grad()\n",
        "    output = alexnet(images)\n",
        "    loss = loss_fn(output, labels)\n",
        "    loss.backward()\n",
        "    optimizer.step()\n",
        "\n",
        "    total_correct = 0\n",
        "    predictions = torch.argmax(output, dim=1)\n",
        "    num_correct = torch.sum(predictions == labels)\n",
        "    total_correct += num_correct\n",
        "\n",
        "    # Visualize accuracy\n",
        "    accuracy = total_correct / labels.shape[0]\n",
        "    accuracies.append(accuracy.item())\n",
        "    if epoch % 10 == 0:\n",
        "        plt.plot(accuracies)\n",
        "        plt.xlabel('epoch')\n",
        "        plt.ylabel('accuracy')\n",
        "        plt.title('Alexnet Prediction Accuracy')\n",
        "        display.clear_output(wait=True)\n",
        "        display.display(plt.gcf())"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MR1Xei8iUZco"
      },
      "source": [
        "### Evaluate Model\n",
        "\n",
        "This cell contains the actual training and evaluation loops for the ImageNette subset we're using. It should take around ten minutes total to run, and will show your model's validation set accuracy increasing over time."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "UDWluyWxpgSg"
      },
      "source": [
        "alexnet = AlexNet().to(device)\n",
        "optimizer = optim.Adam(alexnet.parameters(), lr=1e-4)\n",
        "loss_fn = torch.nn.CrossEntropyLoss()\n",
        "\n",
        "accuracies = []\n",
        "for epoch in range(10):\n",
        "\n",
        "    # Train loop\n",
        "    alexnet.train()\n",
        "    for batch in tqdm.notebook.tqdm(imagenette_train_loader):\n",
        "        images, labels = batch\n",
        "        images = images.to(device)\n",
        "        labels = labels.to(device)\n",
        "        \n",
        "        optimizer.zero_grad()\n",
        "        output = alexnet(images)\n",
        "        loss = loss_fn(output, labels)\n",
        "        loss.backward()\n",
        "        optimizer.step()\n",
        "\n",
        "    # Eval loop\n",
        "    alexnet.eval()\n",
        "    total_correct = 0\n",
        "    for batch in tqdm.notebook.tqdm(imagenette_val_loader):\n",
        "        image, labels = batch\n",
        "        image = image.to(device)\n",
        "        labels = labels.to(device)\n",
        "        \n",
        "        output = alexnet(image)\n",
        "        predictions = torch.argmax(output, dim=1)\n",
        "        num_correct = torch.sum(predictions == labels)\n",
        "        total_correct += num_correct\n",
        "\n",
        "    # Visualize accuracy\n",
        "    accuracy = total_correct / len(imagenette_val)\n",
        "    accuracies.append(accuracy.item())\n",
        "    plt.plot(accuracies)\n",
        "    plt.xlabel('epoch')\n",
        "    plt.ylabel('accuracy')\n",
        "    plt.title('Alexnet Prediction Accuracy')\n",
        "    display.clear_output(wait=True)\n",
        "    display.display(plt.gcf())"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hU4hxgGDz-0M"
      },
      "source": [
        "## Part 4: Ethics\n",
        "Read [this article](https://www.nytimes.com/2019/11/19/technology/artificial-intelligence-bias.html) from the NYT giving a number of perspectives on combating bias in AI. While the the technical problems they bring up e.g. unbalanced datasets can be addressed with technical solutions, the interviewees also discuss social and societal issues which are harder to solve.\n",
        "\n",
        "What group, entity, or person do you think should be responsible for ensuring computer vision systems don't harm more people than they help them, and why?\n",
        "\n",
        "Write your ~200-300 word opinion below, and post it to your pod's slack for discussion."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "cellView": "form",
        "id": "RGkOIWof5PbP"
      },
      "source": [
        "vision_enforcement = '' #@param {type:\"string\"}"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EL1erVZJ9BRu"
      },
      "source": [
        "## Part 5: Get To Know Your Pod\n",
        "\n",
        "Talk with at least two of your podmates and ask what did they wanted to do when they were growing up. Have things changed since then?"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "cellView": "form",
        "id": "pg2kkx759oUm"
      },
      "source": [
        "growing_up = '' #@param {type:\"string\"}"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vzmLwDj5MHoi"
      },
      "source": [
        "# Submission\n",
        "\n",
        "Once you're done, click on 'Share' and add the link to the box below."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "3rmcJEPqMNa3",
        "cellView": "form"
      },
      "source": [
        "link = '' #@param {type:\"string\"}"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "p7JBMPKNMdFE",
        "cellView": "form"
      },
      "source": [
        "import time\n",
        "import numpy as np\n",
        "import urllib.parse\n",
        "from IPython.display import IFrame\n",
        "\n",
        "t7 = time.time()\n",
        "\n",
        "#@markdown #Run Cell to Show Airtable Form\n",
        "#@markdown ##**Confirm your answers and then click \"Submit\"**\n",
        "\n",
        "\n",
        "def prefill_form(src, fields: dict):\n",
        "  '''\n",
        "  src: the original src url to embed the form\n",
        "  fields: a dictionary of field:value pairs,\n",
        "  e.g. {\"pennkey\": my_pennkey, \"location\": my_location}\n",
        "  '''\n",
        "  prefill_fields = {}\n",
        "  for key in fields:\n",
        "      new_key = 'prefill_' + key\n",
        "      prefill_fields[new_key] = fields[key]\n",
        "  prefills = urllib.parse.urlencode(prefill_fields)\n",
        "  src = src + prefills\n",
        "  return src\n",
        "\n",
        "\n",
        "\n",
        "#autofill fields if they are not present\n",
        "#a missing pennkey and pod will result in an Airtable warning\n",
        "#which is easily fixed user-side.\n",
        "try: my_pennkey;\n",
        "except NameError: my_pennkey = \"\"\n",
        "\n",
        "try: my_pod;\n",
        "except NameError: my_pod = \"Select\"\n",
        "\n",
        "try: developments;\n",
        "except NameError: developments = \"\"\n",
        "\n",
        "try: link;\n",
        "except NameError: link = \"\"\n",
        "\n",
        "try: vision_enforcement;\n",
        "except NameError: vision_enforcement = \"\"\n",
        "\n",
        "try: growing_up;\n",
        "except NameError: growing_up = \"\"\n",
        "\n",
        "\n",
        "fields = {\"pennkey\": my_pennkey,\n",
        "          \"pod\": my_pod,\n",
        "          \"developments\": developments,\n",
        "          \"link\": link,\n",
        "          \"growing_up\": growing_up,\n",
        "          \"vision_enforcement\": vision_enforcement}\n",
        "\n",
        "src = \"https://airtable.com/embed/shrdREwKlfCi2BWex?\"\n",
        "\n",
        "\n",
        "#now instead of the original source url, we do: src = prefill_form(src, fields)\n",
        "display.display(IFrame(src = prefill_form(src, fields), width = 800, height = 400))"
      ],
      "execution_count": null,
      "outputs": []
    }
  ]
}